Abstract: We explore a basic noise-free signaling
scenario where coordination and communication are
naturally merged. A random signal $X_1, \dots, X_n$ is
processed to produce a control signal or action sequence
$A_1, \dots, A_n$, which is observed and further processed
(without access to $X_1, \dots, X_n$) to produce a third
sequence $B_1, \dots, B_n$. The object of interest is the
set of empirical joint distributions $p(x, a, b)$ that can
be achieved in this setting. We show that $H(A) \geq
I(X; A, B)$ is the necessary and sufficient condition
for achieving $p(x, a, b)$ when no causality constraints
are enforced on the encoders. We also give results for
various causality constraints.

This setting sheds light on the embedding of dig- 
ital information in analog signals, a concept that 
is exploited in digital watermarking, steganography, 
cooperative communication, and strategic play in team 
games such as bridge. 

I. Introduction 

We are interested in examining a simple batch of 
communication questions that obscure the line be- 
tween "analog" control and "digital" communication 
signaling. How well can a signal be used to both 
carry information (digital) and play an explicit role in 
a system (analog)? Suppose a communication signal 
is required to have certain statistical properties and 
correlations with other signals of interest, such as 
in a multiuser communication setting, or consider 
a control signal that is used to carry additional 
embedded information. This sort of dual-purpose
signaling manifests itself naturally in the simple
communication setting shown in Figure 1.

A. An "Online Communication" Problem 

Let us begin the discussion with an example from
the literature. In 2003, Gossner et al. [1] solved an
interesting problem involving sequential play of a
cooperative penny matching game. The game setting
allows for communication between the players only
through actions in the game, which they refer to as
"online communication." The game involves a random
binary sequence (the "source") and two players,
Alice and Bob. Alice knows the source sequence,
but Bob doesn't. Alice and Bob repeatedly attempt
to guess^1 the source sequence, one bit at a time.
They obtain one point whenever both of them guess 
correctly. After each guess, they each see the guess 
of the other person and the source bit. As you might 
expect, they are allowed to strategize before the 
source sequence is revealed to Alice, but after the 
game begins they cannot communicate explicitly - 
only implicitly through the game itself. What is the 
best average score that can be achieved? 

Gossner et al. show that the optimal average score
of this game is approximately 0.81, which is significantly
better than the average score that can be achieved through
trivial (albeit clever) strategies. (Warning: Spoiler! Pause
here if you wish to solve this problem on your own.)
You can achieve this score using techniques from
communication theory (error-correction codes) and
information theory. The main ideas are block-Markov
coding, rate-distortion theory for Hamming distortion,
and input-constrained channel capacity (binary
channel with no noise). The analysis by Gossner et al.
was combinatorial rather than information theoretic.
They also present a matching upper bound which is
very specific to the particular game being played.

A nice surprise related to this game emerges from
the results of our work. Suppose that the game were
made more difficult: after each guess, Bob sees the
guess that Alice made but does not see the source bit
(nor does he know the score of the game until after
the game is finished). It turns out that the optimal
average score of the game is the same! This may be
surprising because the strategy prescribed by Gossner
et al. to achieve optimality requires that Bob consider
the past source bits when making his next guess. The
strategy must be significantly modified in order to
achieve optimality when Bob does not see the past
source bits. This observation is not limited to the
specific repeated game being played. We provide an
information theoretic solution to general games of
this form in Section V.

^1 Alice knows the source sequence, so her "guesses" are always
correct if she chooses. The optimal strategy will have Alice
inserting wrong guesses for the sake of communication.

II. Uses and Illustrations 

We encounter a variety of situations in signal 
processing and communication where a signal plays 
multiple roles. Perhaps the most relevant to this 
work are those involving network communication. 
In a multiuser joint source-channel coding setting, 
the encoders must structure communication signals 
to convey information about the sources while also 
taking advantage of statistical dependencies of the 
sources to correlate and align the communication
signals. 

A specific situation where a communication sig- 
nal is used directly and indirectly is the "cribbing" 
transmitters encountered in the work of Van der 
Meulen 121 and Willems El, and more recently 
by Permuter and Asnani (5). Here a multiple access 
channel is considered, but the channel input from 
one transmitter is overheard by the other transmitter, 
allowing them to learn about each other's message 
and cooperate. Here it is discovered that the channel 
input should not only carry information intended 
directly for the other transmitter, but it should also 
be a suitable transmission signal. 

In other examples, there are explicit goals to em- 
bed information in signals, such as digital watermark- 
ing and steganography. Here, a media signal, such as 
video or audio, is augmented to carry information 
in the form of an ID tag or data, which is usually 
intended to be imperceptible to humans.
Research exploring the capacity to embed informa- 
tion under signal distortion constraints can be found 
in m, m, IM, and Ii9j. 

Let us now suggest some illustrations of the sce- 
nario we are concerned with in a concrete, though 
playful, manner. 

A. Game of Bridge 

In the game of bridge, players bid for contracts 
which allow them to call trump, pass cards, and 
hopefully earn enough points to validate the contract. 
The bid consists of a number and a suit, indicating
how many tricks will be won (beyond the de facto six)
and a suit for trump. However, a player who makes
a first bid of '1 Clubs' may not be bidding for the 
sake of winning the contract. Instead, the bid might 
be a message to his partner that there is no dominant 
suit in his hand. Communication strategies for bridge 
are limited by the effect they have on the play of the 
game. 

B. Collusion 

High-speed stock trading systems make money by
their precise timing of buying and selling. Suppose 
two trading systems wish to collude in order to shift 
market prices, and they wish to do so in a way 
that is not discoverable over standard communication 
channels. How much can they communicate through 
the timing of their buys and sells without adversely 
affecting their profits? 

C. Multi-part Printing 

Two printers are used to print a color document. 
The first prints all colors, and the second prints black 
only. However, the electronic document for printing 
is sent only to the first printer. The second printer 
scans the color document and adds black where 
needed. The color printer, which mixes three inks 
to create black, can save ink by leaving black for the 
second printer to take care of, but information about 
the location of the black must be written into the 
image somehow. How much ink can be saved? 

III. Cascade of Controllers 

A. Problem Statement 

An i.i.d. random process $\{X_i\}$ is distributed
according to $p_0$, which is to say that any finite block
of symbols $X_1, \dots, X_n$ is distributed according to

$$\prod_{i=1}^{n} p_0(x_i).$$

The cascade of controllers shown in Figure 1
produces two additional sequences $\{A_i\}$ and $\{B_i\}$.
The $A_i$'s are functions of the $X_j$'s, and the $B_i$'s
are functions of the $A_j$'s, possibly with causality
constraints. The system runs for a finite but arbitrarily
large number of iterations, $n$, and we use the
superscript notation $X^n$ to represent the sequence
$X_1, \dots, X_n$. The goal is to coordinate the sequence
of triples $(X, A, B)_i$ with a desired empirical
distribution.
Fig. 1. Cascade of controllers. The source of information, $X^n$,
is an i.i.d. sequence with a known distribution. Controller 1
produces a control sequence $A^n$ which has information embedded
into it for Controller 2. Without access to the source, Controller
2 processes $A^n$ to produce a control sequence $B^n$.



We characterize the coordination that is achievable
among the three control signals in terms of the
empirical coordination of [10]. Under this framework,
a coordination scheme is summarized by the joint
distribution that it achieves, in the sense that the
frequencies of triples $(X, A, B)_i$ correspond closely
with the specified joint distribution with high
probability. Unlike the problems considered in [10], the
cascade of controllers setting has no explicit
rate-limited communication channels.

To state the criterion for empirical coordination
formally, a conditional distribution $p(a, b|x)$ can be
achieved if for all $\epsilon > 0$ there exist an integer $n$ and
encoding functions $f$ and $g$ (satisfying the necessary
causality constraints) such that

$$\Pr\left( \left\| P_{X^n, A^n, B^n}(x, a, b) - p_0(x)\, p(a, b|x) \right\|_{TV} > \epsilon \right) < \epsilon,$$

where $A^n \in \mathcal{A}^n$, $B^n \in \mathcal{B}^n$, the induced
empirical distribution is $P_{X^n, A^n, B^n}(x, a, b) =
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{(X_i, A_i, B_i) = (x, a, b)\}}$,
and $\| \cdot \|_{TV}$ is the total
variation distance between two distributions.
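The quantities in this definition can be computed directly for short sequences. The following is a minimal sketch, assuming the common convention that total variation is half the L1 distance; the function names and the toy target distribution are hypothetical and chosen only for illustration.

```python
from collections import Counter


def empirical_distribution(xs, as_, bs):
    """Return the empirical joint distribution of the triples (x_i, a_i, b_i)."""
    n = len(xs)
    counts = Counter(zip(xs, as_, bs))
    return {triple: c / n for triple, c in counts.items()}


def total_variation(p, q):
    """Total variation distance, taken here as half the L1 distance (assumed convention)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


# Hypothetical target: X ~ Bernoulli(1/2) with A = B = X.
target = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

xs = [0, 1, 1, 0]
as_ = [0, 1, 1, 0]
bs = [0, 1, 0, 0]  # the third symbol of B deviates from the target

emp = empirical_distribution(xs, as_, bs)
tv = total_variation(emp, target)
print(emp, tv)
```

A coordination scheme achieves the target when this distance falls below $\epsilon$ with probability at least $1 - \epsilon$ for some block length $n$.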

The coordination set of all achievable distributions
for empirical coordination is designated as

$$\mathcal{V} = \{\text{achievable } p(a, b|x)\}.$$

The main results of this paper are the characterizations
of the coordination sets in Theorem 4.1,
Theorem 5.1, and Figure 2.

B. Sequences - An Alternative Statement 

The coordination scenario of this paper is described
as controllers acting on signals, providing a
natural operational meaning. However, the results of
the analysis in this work are simply statistical and
probabilistic statements about sequences. Consider
the set of all groups of random variables $X^n$, $A^n$,
and $B^n$ having the following two properties. First,
$X^n - A^n - B^n$ forms a Markov chain. Second, $X^n$ is
an i.i.d. sequence according to $p_0(x)$. This is exactly
the set of random variables that can be produced
by a cascade of non-causal randomized controllers.
Theorem 4.1 then relates to the first-order statistics
of the sequences in this set.

C. Maximize Average Score 

We can take a different approach to analyzing
coordination by specifying a reward function for the
three combined signals. Let the function $\pi(x, a, b)$
be a reward obtained for each occurrence of the
triple $(x, a, b)$ in the sequence of combined signals
$(X, A, B)_1, (X, A, B)_2, \dots$. We can then ask for the
greatest possible average reward under the constraints
imposed by the cascade of controllers of Figure 1,
taking the supremum over all choices of block length
$n$ and controllers.

It turns out that this analysis is fundamentally the
same as characterizing the coordination set $\mathcal{V}$. The
optimal average reward corresponding to the function
$\pi$ can be found by maximizing $\mathbf{E}\, \pi(X, A, B)$
over the coordination set of conditional distributions.
Likewise, the coordination set, being a convex set, is
fully characterized by the optimal average reward for
all reward functions $\pi$. This connection is due to the
close relationship between the average function value
of a sequence and the empirical distribution. For a
detailed proof of the relationship, see the discussion
in Section VI of [10] and the proof in Section VII.
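The relationship between average reward and empirical distribution can be checked on a toy sequence. The sketch below is illustrative only: the penny-matching payoff (one point when both players match the source) follows the game described in the introduction, while the helper names and the sample triples are hypothetical.

```python
from collections import Counter


def average_reward(triples, reward):
    """Average of reward(x, a, b) along the sequence of triples."""
    return sum(reward(x, a, b) for x, a, b in triples) / len(triples)


def expected_reward(dist, reward):
    """Expectation of reward(X, A, B) under a joint distribution given as a dict."""
    return sum(p * reward(x, a, b) for (x, a, b), p in dist.items())


def penny_matching(x, a, b):
    """One point when both actions equal the source symbol, zero otherwise."""
    return 1.0 if a == x and b == x else 0.0


# Hypothetical sequence of (x, a, b) triples.
triples = [(0, 0, 0), (1, 1, 1), (1, 1, 0), (0, 1, 0)]
n = len(triples)
empirical = {t: c / n for t, c in Counter(triples).items()}

avg = average_reward(triples, penny_matching)
exp = expected_reward(empirical, penny_matching)
print(avg, exp)
```

The two numbers coincide for any sequence and any reward function, which is why optimizing the average reward reduces to optimizing over achievable empirical distributions.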

IV. Non-causal Controllers 

Controller 1 and Controller 2 produce signals according
to unconstrained non-causal encoding functions:

$$A^n = f(X^n), \qquad B^n = g(A^n).$$
Theorem 4.1: The coordination set $\mathcal{V}$ for the cascade
of controllers in Figure 1 is the set of conditional
distributions $p(a, b|x)$ such that the joint distribution
with the source, given by $p_0(x)\, p(a, b|x)$, satisfies

$$H(A) \geq I(X; A, B).$$
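The condition of Theorem 4.1 is easy to evaluate numerically for a given joint distribution. The following is a minimal sketch, assuming the joint distribution is supplied as a dictionary from $(x, a, b)$ triples to probabilities; the function names and the two example distributions are hypothetical.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, idx):
    """Marginalize a joint pmf over (x, a, b) onto the coordinates in idx."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[i] for i in idx)] += p
    return out


def satisfies_noncausal_condition(joint, tol=1e-12):
    """Check H(A) >= I(X; A, B), using I(X; A, B) = H(X) + H(A, B) - H(X, A, B)."""
    h_a = entropy(marginal(joint, (1,)))
    h_x = entropy(marginal(joint, (0,)))
    h_ab = entropy(marginal(joint, (1, 2)))
    h_xab = entropy(joint)
    mi = h_x + h_ab - h_xab
    return h_a >= mi - tol, h_a, mi


# Example 1: X ~ Bernoulli(1/2) and A = B = X.
copy_joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
ok_copy, h_copy, mi_copy = satisfies_noncausal_condition(copy_joint)

# Example 2: A is constant but B = X, so A would have to carry one bit it cannot.
const_joint = {(0, 0, 0): 0.5, (1, 0, 1): 0.5}
ok_const, h_const, mi_const = satisfies_noncausal_condition(const_joint)
print(ok_copy, ok_const)
```

The first example sits on the boundary of the coordination set, with both sides equal to one bit, while the second violates the condition because a constant $A$ cannot inform Controller 2 about $X$.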

A. Achievability 

To efficiently achieve coordination with a cascade
of controllers, we populate a codebook of $(a^n, b^n)$
pairs. Controller 1 identifies a pair $(a^n, b^n)$ in the
codebook which yields the desired correlation with
$X^n$. However, Controller 1 only produces $A^n = a^n$,
which is the first half of the codeword. If the codebook
is small enough, Controller 2 will be able to
identify which codeword Controller 1 selected based
only on observing $A^n$.

Consider a source distribution $p_0(x)$ and a desired
conditional distribution $p(a, b|x)$ that satisfies
$H(A) > I(X; A, B)$. Select a constant $r$
such that $H(A) > r > I(X; A, B)$. Let $\mathcal{C} =
\{(a^n(k), b^n(k))\}_{k=1}^{2^{nr}}$ be a randomly generated
codebook, where each $(a^n(k), b^n(k))$ is independently
drawn from the marginal distribution induced by
$p_0(x)\, p(a, b|x)$.

Controller 1 finds an integer $k$ such that
$(X^n, a^n(k), b^n(k))$ is jointly typical (in the sense
that the empirical joint distribution is close to the
desired distribution in total variation). This will be
successful with high probability if $n$ is large enough,
as a consequence of rate-distortion theory, since
$r > I(X; A, B)$. Controller 2 searches the codebook
$\mathcal{C}$ for the first $j$ such that $a^n(j) = A^n$ and produces
the control sequence $B^n = b^n(j)$. If Controller 1
was successful, then $A^n$ is a typical sequence, and
with high probability there is no other codeword in
the randomly generated codebook whose first half
equals $A^n$, since $r < H(A)$.
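The final claim of the achievability argument can be illustrated with a union bound. This is a standard sketch under stated assumptions rather than the paper's own derivation: $\delta > 0$ is an assumed typicality slack, $k$ is the index selected by Controller 1, and the dependence between the selected index and the rest of the codebook is ignored for simplicity.

```latex
% Assumption: a^n is typical, so each independently drawn codeword j \neq k
% satisfies \Pr(a^n(j) = a^n) \leq 2^{-n(H(A) - \delta)}.
\Pr\bigl(\exists\, j \neq k : a^n(j) = a^n\bigr)
  \;\leq\; 2^{nr} \cdot 2^{-n(H(A) - \delta)}
  \;=\; 2^{-n(H(A) - \delta - r)} ,
% which vanishes as n grows whenever \delta < H(A) - r.
```

Under these assumptions, the gap between $H(A)$ and $r$ is what keeps Controller 2's codebook lookup unambiguous.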

B. Converse 

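This problem does not involve rates of communication.
The converse rests on the following observation.

$$
\begin{aligned}
H(A^n) &\geq I(X^n; A^n) \\
&\overset{(a)}{=} I(X^n; A^n, B^n) \\
&= \sum_{q=1}^{n} I(X_q; A^n, B^n | X^{q-1}) \\
&= n\, I(X_Q; A^n, B^n | X^{Q-1}, Q) \\
&= n\, I(X_Q; A^n, B^n, X^{Q-1}, Q) \\
&\geq n\, I(X_Q; A_Q, B_Q),
\end{aligned}
$$

where (a) comes from the fact that $X^n - A^n - B^n$
form a Markov chain, and the next-to-last equality holds
because $X_Q$ is independent of $(X^{Q-1}, Q)$ for an i.i.d.
source. $Q$ is a time sharing random
variable uniformly distributed on $\{1, \dots, n\}$ and
independent of $(X^n, A^n, B^n)$. Similarly,

$$
\begin{aligned}
H(A^n) &= \sum_{q=1}^{n} H(A_q | A^{q-1}) \\
&= n\, H(A_Q | A^{Q-1}, Q) \\
&\leq n\, H(A_Q).
\end{aligned}
$$

Combining the two chains gives $H(A_Q) \geq I(X_Q; A_Q, B_Q)$.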



V. One Causal Controller 

Let us revisit the game Gossner et al. solved in [1].
In their setting, Controller 1 observes the whole $X^n$
sequence and then generates an action sequence $A^n$.^2
Controller 2 has a sequence of causally constrained
action functions $g_i(\cdot)$ for $i = 1, \dots, n$. Therefore, the
controllers act according to the following encoding
functions:

$$A^n = f(X^n),$$
$$B_i = g_i(A^{i-1}) \quad \text{for } i = 1, \dots, n.$$



Theorem 5.1: The coordination set $\mathcal{V}$ for the cascade
of controllers in Figure 1 with a strict causality
constraint on Controller 2 is the set of conditional
distributions $p(a, b|x)$ such that the joint distribution
with the source, given by $p_0(x)\, p(a, b|x)$, satisfies

$$H(A|X, B) \geq I(X; B).$$

A. Achievability 

We use block-Markov coding. Each block is of
length $k$, and we denote the $i$th block $X^k(i)$. Consider
a joint distribution $p_0(x)\, p(a, b|x)$ that satisfies
$H(A|X, B) > I(X; B)$, and select $r$ such that
$H(A|X, B) > r > I(X; B)$. We generate a codebook
$\mathcal{C}$ of $2^{kr}$ sequences $B^k$ according to
the marginal distribution induced by $p_0(x)\, p(a, b|x)$
to cover $X^k$. We also randomly bin all the typical
$A^k$ sequences into $2^{kr}$ bins.

At the beginning of the $i$th block, Controller
1 finds an index $j_{i+1}$ in the codebook such that
$B^k(j_{i+1})$ is jointly typical with $X^k(i+1)$. Controller
1 then finds an $A^k$ sequence in the $j_{i+1}$th bin that
is jointly typical with $(X^k(i), B^k(j_i))$ and outputs
that $A^k$ sequence in the $i$th block. At the end of the
$i$th block, Controller 2 observes the $A^k(i)$ sequence
from Controller 1 and thus decodes the bin index $j_{i+1}$.
In the $(i+1)$th block, Controller 2 simply outputs
$B^k(j_{i+1})$ as its actions. This scheme works with high
probability and yields an empirical distribution close
to $p_0(x)\, p(a, b|x)$.



B. Converse

$$
\begin{aligned}
n H(X) &= H(X^n) \\
&\overset{(a)}{=} H(X^n, A^n) \\
&= \sum_{q=1}^{n} H(X_q, A_q | X^{q-1}, A^{q-1}) \\
&\overset{(b)}{=} \sum_{q=1}^{n} H(X_q, A_q | X^{q-1}, A^{q-1}, B_q) \\
&\leq \sum_{q=1}^{n} H(X_q, A_q | B_q) \\
&= n H(X_Q, A_Q | B_Q, Q) \\
&\leq n H(X_Q, A_Q | B_Q),
\end{aligned}
$$

where (a) is because $A^n$ is a function of $X^n$ and
(b) is due to the fact that $B_q$ is a function of $A^{q-1}$.
The random variable $Q$ is uniformly distributed on
the set $[n]$ and independent of $(X^n, A^n, B^n)$. Note
that $H(X) \leq H(X, A|B)$ is equivalent to $I(X; B) \leq
H(A|X, B)$.

^2 Technically, Controller 1 also observes $B_1, \dots, B_{i-1}$ when
producing action $A_i$, but this can be safely ignored.

Fig. 2. The coordination set under various delay constraints.

Based on Theorem 5.1 and the discussion in
Section III-C, we can characterize the optimal average
score of the game in the following corollary:

Corollary 5.2: For a game that pays out $\pi(x, a, b)$
($x$ represents the source realization, $a$ represents the
action of Alice, $b$ represents the action of Bob), and
an i.i.d. source sequence with distribution $p_0(x)$, the
optimal average score of the game (assuming Alice
knows the entire source sequence, Bob sees past
actions of Alice, and they produce actions
simultaneously) is

$$\max_{p(a, b|x) \,:\, H(A|X, B) \geq I(X; B)} \mathbf{E}\, \pi(X, A, B).$$

Remark: If we specialize the corollary to the
case where $X \sim$ Bernoulli(1/2) and carry out the
optimization, we will recover the optimal score in [1].
Furthermore, the score cannot be improved even if
Bob is allowed to also see the past source realizations
(that he has already attempted to guess). The
converse for Theorem 5.1, in particular step (b),
still holds.
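The optimization mentioned in the remark can be sketched numerically. The reduction below is an illustrative assumption, not the paper's derivation: take the penny-matching payoff $\pi(x, a, b) = 1$ when $a = b = x$ and 0 otherwise, and note that for a score $s$ the entropy $H(X, A|B)$ is at most $H_b(s) + (1 - s)\log_2 3$, attained by spreading the three losing outcomes uniformly. Using the equivalent form $H(X) \leq H(X, A|B)$ of the constraint, the best score is then the largest $s$ with $H_b(s) + (1 - s)\log_2 3 \geq 1$. The function names are hypothetical.

```python
import math


def binary_entropy(s):
    """Binary entropy H_b(s) in bits."""
    if s in (0.0, 1.0):
        return 0.0
    return -s * math.log2(s) - (1 - s) * math.log2(1 - s)


def slack(s):
    """Largest possible H(X, A | B) at score s, minus H(X) = 1 bit."""
    return binary_entropy(s) + (1 - s) * math.log2(3) - 1.0


def best_score(lo=0.5, hi=1.0, iters=60):
    """Bisection for the root of slack on [lo, hi], where slack is decreasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if slack(mid) >= 0:
            lo = mid  # constraint still satisfiable at this score
        else:
            hi = mid  # score too high for the entropy constraint
    return lo


score = best_score()
print(round(score, 4))
```

Bisection suffices here because the slack is strictly decreasing for $s \geq 1/2$, positive at $s = 1/2$, and negative at $s = 1$.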



VI. Further Extensions 

In general, the encoding functions for both controllers
can be subject to delay constraints, i.e.,

$$A_i = f_i(X^{i - d_1}),$$
$$B_i = g_i(A^{i - d_2}),$$

where $d_1$ and $d_2$ are the delays. The results under
different combinations of $d_1$ and $d_2$ are listed in Figure 2.
Note that $-\infty$ means non-causal. Section IV solved
the case $d_1 = -\infty$ and $d_2 = -\infty$, and Section V
solved the case $d_1 = -\infty$ and $d_2 = 1$.